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Abstract 

This survey paper discusses the history of approximation formulas for n-th order deriva- 
tives by integrals involving orthogonal polynomials. There is a large but rather disconnected 
corpus of literature on such formulas. We give some results in greater generality than in 
the literature. Notably we unify the continuous and discrete case. We make many side re- 
marks, for instance on wavelets, Mantica's Fourier-Bessel functions and Greville's minimum 
Ra formulas in connection with discrete smoothing. 



1 Introduction 

In many applications one needs to estimate or approximate the first or higher derivative of a 
function which is only given in sampled form or which is perturbed by noise. Good candidates 
d for an approximation of the first derivative /'(x) are the two expressions 
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and 
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(1.1) 
(1.2) 



for 6 small. The first one is continuous, the second one discrete. These formulas have a long 
history going back to Cioranescu [H] (1938), Haslam-Jones [29] (1953), Lanczos jlSJ (5-9.1)] 
(1956) and Savitzky & Golay E 



(1964). What remains hidden in (1.1) and (1.2) is that the 
factor ^ in the integrand or summand can better be considered as an orthogonal polynomial of 



degree 1 with respect to a constant weight function on [—1, 1] (in case of (1.1)) or with respect 
to constant weights on {—N, —N -|- 1, . . . , A^} (in case of (|1.2|)). With this point of view it can 



be immediately shown that (1.1) and (1.2) tend to f'{x) as 5 ^0. Moreover the way is opened 



to a far reaching generalization of (1.1) and (1.2) for the approximation of higher derivatives 
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and with the involvement of general orthogonal polynomials. Such results were already given 
by Cioranescu [H] in 1938. 

Curiously enough, none of the later papers mentioned above is referring to one of the earlier 
papers. The results of Cioranescu [11] and Haslam-Jones [29] were hardly taken up by anybody. 
On the other hand Lanczos [42, (5-9.1)] and in particular Savitzky & Golay [57] had a lot of 
follow-up by others. One reason for this citation behaviour is probably that Cioranescu and 
Haslam-Jones were pure analysts, Lanczos was an applied mathematician working in numer- 
ical analysis, and Savitzky & Golay were motivated by spectroscopy, considered as a part of 
chemistry. 

It is the aim of the present paper to give a survey of these results, developments and further 
considerations suggested by them. Moreover, we formulate some results in a more general way 
than has probably appeared before in literature. It was for us a surprise to see that so many 
different parts of classical analysis and of applied mathematics are tied together by this theme. 
All papers until now only treated smaller parts of this wide field. We hope to share with our 
readers the pleasure to have a comprehensive view. 

The present work stems from a long practice by the first author in signal processing in applied 
situations, where he met the problem of differentiating an analog signal (and later a sampled 
signal) disturbed by noise, see for instance Strik [63]. The main problem occurring there was 
the difficulty to build (or program) an ideal differentiator, because the noise of the system will 
cause an instability. Therefore an integrating factor for the high frequencies to the differentiator 
is needed. When the signal is sampled the same problem occurs (Hamming [26]). Without 
being aware of the literature mentioned in the beginning of this Introduction, he tried to use an 
integrating factor by the method of the least squares and then he independently found special 
cases of the approximation formulas for higher derivatives by integrals involving all classical 
orthogonal polynomials as well as the Chebyshev polynomials of a discrete variable. He never 
published the results, but he used them as material for a course in stochastic system theory at 
the "Saxion Hogeschool" in Enschede. 

The contents of this paper are as follows. In section 2 we give preliminaries on orthogonal 
polynomials and on the Taylor formula. In section 3 we start formulating the approximation 
theorem in great generality and next discuss how the contributions of Cioranescu, Haslam-Jones 
and Lanczos are related to this general theorem. We emphasize the important role of least-square 
approximation behind this theory. Our discussion gives room for several side observations, for 
instance on Jacobi type polynomials and on wavelets. Section 4 is focused on the discrete case 
and the applications to filters. We start with a multi-term extension of the main theorem in 
section 3. Its special case of constant weights contains the seminal work of Savitzky & Golay. 
We introduce the characteristic (or transfer) function and we make connection with Mantica's 
|46j Fourier-Bessel functions. In the smoothing case we discuss the work by Greville [[21] (based 
on older work by Sheppard [M]) on so-called minimum and minimum R^o formulas. In the 
Appendix we give new derivations of the characteristic functions for these cases, using Hahn and 
Krawtchouk polynomials. The -Rqo case connects with another survey paper |39j by the second 
author and Schlosser. Finally, in section 5, we discuss log-log plots of transfer functions in some 
simple cases. 
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2 Preliminaries 



2.1 Orthogonal polynomials 

Let be a positive Borel measure on M with infinite support (or equivalently a nondecreasing 
function on M with an infinite number of points of increase) such that dfi{x) < oo for all 

n G Z>o. Consider polynomials p„ (n G Z>o) of degree n such that 

Pm(a;)Pn(2;) d^(x) = (m/n). (2.1) 

The pn are orthogonal polynomials with respect to the measure fi, see for instance Szego |64j . 
Up to constant nonzero factors they are uniquely determined by the above properties. If /j, has 
support within some closed interval I then we say that the pn are orthogonal polynomials with 
respect to fi on I. Typical cases of the orthogonality measure fj, are: 

1. diJ,{x) = w{x) dx on / with the weight function w a nonnegative integrable function on /. 
Then ( |2.1[ ) takes the form 

Pmix) Pn{x) w{x) dx = (m^n). 



2. fi has discrete infinite support {xq, xi, X2, ■ 
(weights) such that (2.1) takes the form 



.}. So there are positive numbers wo,wi,W2,-. 



^Pmixk)Pn{xk)Wk = (m^n). 



(2.2) 



k=0 



3. Contrary to what was supposed earlier, we can also consider the case that fi has finite 
support {xq, xi, . . . , xjy} with corresponding weights wq, wi, . . . , wn- Then we have or- 
thogonal polynomials pn only for n = 0, 1, . . . , and (2.1 ) takes the form 



N 



'^Pmixk)Pn{Xk)Wk = (m^n). 



(2.3) 



A:=0 



Special examples of case 1 are given by the classical orthogonal polynomials ( Jacobi, Laguerre 
and Hermite polynomials). In particular, we will meet the Legendre polynomials Pn, which are 
special Jacobi polynomials and where / = [—1, 1], w{x) = 1 and -Pn(l) = 1- 

A special example of case 3 are the Hahn polynomials x i— > Qnix; a, (3, N) for a = /3 = 
(n = 0,1,..., TV). Here Xi = i,Wi = I (i = 0,1, N) and Q„(0;0,0,iV) = 1 (see [Ml §9.5] 
and references given there, or |49| Ch. 2], where another notation is used). Hahn polynomials 
of general parameters were already introduced in 1875 by Chebyshev [1^, long before Hahn, 
but the above special case of constant weights is in particular named after Chebyshev, although 
in a slightly different notation and normalization. See Chebyshev's polynomials of a discrete 
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variable tn{x) = tn{x,N) (n = 0, 1, . . . , - 1) in [Ml §2.8], [El §10.23]. They are orthogonal 
on the set {0,1,..., — 1} with respect to constant weights 1. Hence we must have that 
tnix, N) = const. Qn{x; 0, 0, A^ — 1). The constant can be computed by comparing the recurrence 
relation [361 (9.5.3)] for a = /3 = and N replaced by — 1 with the recurrence relation [18^ 
10.23(6)]. Then we obtain: 

tn{x, N) = i-N + l)n Qnix; 0,0, N -I), (2.4) 

where (a)„ := a{a + 1) . . . (a + n — 1) is the Pochhammer symbol. Thus tn{0, N) = {—N + 1)„. 
These polynomials are also known as Gram polynomials, see [33\ §7.13 and §7.16]. This last name 
we will use in this paper. The polynomials x i— )• tn{x, N) have the shifted Legendre polynomials 
X I— >■ PnC^x; — 1) (orthogonal on [0, 1] with respect to a constant weight function) as a limit case 
(see [611 (2.8.6)]): 

lim A^"" tn{Nx, N) = P„(2x - 1) (2.5) 

N-¥oo 

For given orthogonal polynomials Pn define the constants /i„ and kn by 

/,„ := / p^i.f M^). = + terms of degree less tha,. n. (2.6) 

Jr 

Lemma 2.1. We have 

Pn{x)x^df,{x) = ^ (2.7) 
Proof We have /c„2;" = Pn{x) + qn-i{x) with q-n-i a polynomial of degree < n. Hence 

^ /p„W.«M.)= /p.W,„-,W*w = />„ + o = .„ n 

JR JR Jr 

One of the properties which characterize the classical orthogonal polynomials is that they 
are given by a (generalized) Rodrigues formula 

1 r/" 
KnW[x) dx" 

(see [m 10.6(1)]). Here X is a polynomial of degree < 2 and 

Kn = izl)^^ / {X{x)Td^^{x). (2.9) 

JR 



For the proof of (2.9) substitute (2.8) and d^{x) = w{x) dx in (2.7) and perform integration by 
parts n times. 



The explicit values of hn and kn defined by ( |2.6[ ) can be given in our two main examples: 
• Legendre polynomials Pn (see [361 (9.8.63), ((9.8.65)]): 

hn = -^, kn = 2-^(^''\. (2.10) 

2n + 1 Kn I 
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From this we immediately obtain the values of hn and kn in the case of shifted Legendre 
polynomials x i— )• PnC^x — 1): 



1 



2n + 1 



kn 



2n 

n 



Gram polynomials x H> tn{x,N) (combine (2.4) with jSH (9.5.2), (9.5.4)]): 



hn 



{N -n) 



2n+l 



2n + 1 



2n 

n 



(2.11) 



(2.12) 



From this we immediately obtain the values of hn and kn in the case of centered Gram 
polynomials x ^ tn{x + N , 2N + 1): 



{2N + 1- n)2n+i 
2n + 1 



kn 



2n 

n 



(2.13) 



The reproducing kernel for the space Vn of polynomials of degree < n in the Hilbert space 



L (M,/i) is given by 



K„(x,y):=2^ ^ (a:,yG 



fc=0 



Then (see [Ij Remark 5.2.2]) the Christoffel-Darboux formula gives 

/cn p„+i (a;)p„ (y) - pn {x)pn+i (y) 



Kn(x,y) 



kn+l hn 



x-y 



{x / y). 



The integral operator /C„ corresponding to (2.14) is given by 

(/C„/)(x):= / f{y)Kn{x,y)dfi{y) {f G L^R), x e R), 
It is the orthogonal projection of the Hilbert space L^(M, /i) onto Vn- In particular, 

ICnf = f ife Vn). 



(2.14) 



(2.15) 



(2.16) 



(2.17) 



Furthermore, for / G L^(]R, /i), /C„/ is the element of Vn which is on minimal distance to / (in 
the metric of the Hilbert space -L^(M,/x)). 

Later we will need the following. For Legendre polynomials formula (2.15) for y = 1 and 
with n replaced by n — 1 becomes: 



K 



n— 1 



xA) 



1 Pnix) - Pn-lix) 
2" —1 

i(P;(x) + P^i(x)). 



(2.18) 
(2.19) 



The Pi]l°^{x) in is a Ja cobi polynomial (see [Ml Ch. 4]). We used in the first 

equality, [Ml 10.8(32)] in ^J^, and ^ 10.10(13), 10.10(14)] in KM. 
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2.2 Taylor formula 

Recall a version of Taylor's theorem formulated by Hardy \28\ §151]: 

Proposition 2.2. Let x G M and let I be an interval containing x. Let f be a continuous 
function on I such that its derivatives of order 1,2, ... ,n at x exist. Then 



fiy) = Yl (y - + - asy^x onl. 



k=0 



kl 



(2.20) 



In this proposition the derivatives should be interpreted as right or left derivatives if x is an 
endpoint of the interval I. (Although this special case is not explicit in Hardy's formulation, it 
is also a consequence of his proof.) 

Proposition 2.2 suggests a notion more general than n-th derivative. Let / be a continuous 
function on an interval / containing x and let there be constants cq,ci, . . . ,Cn such that 



k=0 



as y — )■ X on I. 



(2.21) 



Then we call c„ the n-th Pea no derivative of / at x. This definition goes back to Peano 
in 1891. By Proposition 2.2 the existence of f^"\x) implies the existence of the n-th Peano 
derivative, equal to f^^\x). The converse implication is true for n = 1 but not for n > 1, see a 
counterexample in [19^ Example 1.2]. 

For later use we restate Proposition 2.2 as follows: 



Proposition 2.3. With the assumptions of Proposition 2.2 we have 

/(x + 5) = 5'= + <5"F,,4<5) 



A:=0 



(2.22) 



with Fx^n continuous on I — x and F^^niO) = 0. Furthermore, Fx^n is bounded on L — x if f is 
hounded on L . Finally, if L is unbounded and f is of polynomial growth on L then F^^n is of 
polynomial growth on I — x. 



We can say more about the remainder term in (2.22) if moreover / G C^{L) (see for instance 
Apostol PI Theorem 7.6]): 



Proposition 2.4. Keep the assumptions of Proposition 2.2. Assume moreover that f G C^{I). 
Then for Fx ^n{^) in (2.22) we have 



Fx,n{S) 



1 



{n-iy. Jo 



(/(n)(^ + t5)_/(n)(^)) (l_t) 



(n)/ 



\n-l 



dt 



(2.23) 



and the function (x,y) i— )• Fx^niu — x) is continuous on L x L. If f^^^ is of polynomial growth on 
I then Fx^n{S) — > as 5 — )■ uniformly for x in compact subsets of I. 
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3 Higher derivatives approximated by integrals 



Let us first state and prove the main theorem and next discuss the many instances of it in the 
Hterature, usually more restricted but occasionally more general than our formulation. 

Theorem 3.1. For some n let pn be an orthogonal polynomial of degree n with respect to the 
orthogonality measure /i. Let x E M. Let I he a closed interval such that, for some e > 0, 
x + (5^ £LifO<d<£ and ^ G supp(/i). Let f be a continuous function on I such that its 
derivatives of order 1,2, ... ,n at x exist. In addition, if I is unbounded, assume that f is of at 
most polynomial growth on I. Then 



f 



in) I 



Irni ■ 



hn sio 6^ 
where the integral converges absolutely. 



fix + 60PniOdm, 



(3.1) 



Proof If I is bounded then the integral in (3.1) converges absolutely by continuity of /. If 
I is unbounded then, for fixed 6 G [0,e], we have for some r > that f{x + (5^) = 0(|^| 
^ — )• iboo on supp(/i). So also in that case the integral in (3.1) converges absolutely. 



as 



1 



By substitution of (2.22), by orthogonality and by (2.7) we have: 



/(x + <50Pn(0^^M0=E 



fc=0 

/W(x 



kl 



Jm 



fin) 



knUl 



ePn{Odm+ I F,,n{SOCPn{^)dfl{^) 
JM. 

/^"^(x)+ / F,,ni60CPniC)dm- 



Thus the theorem will be proved if we can show that 



lim / F,,nmePn{Odfi{C) = 0. 



(3.2) 



By the second part of Proposition 



2.3 



we have the estimate \Fx^n{h)\ < C(l + \h\Y {h £ I — x) 
for some C > 0, r > 0. Hence, for 5 G [0,e] and ^ G supp(^) we have the estimate |-Fx,n('^C)l ^ 
C(l + e|^|)^'. Thus, the dominated convergence theorem can be applied to the left-hand side of 
(|3.2|). Then, again by Proposition 2.3 it follows that (3.2) is true. □ 



Note the following special cases of (|3.1|). 



Gram polynomials x i— )• tn{x,N) (use (2.12[)): 



(3.3) 
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Centered Gram polynomials x i— )• + N,2N + 1) on 2N + 1 lattice points (use 
In particular, for n = 1: 



2N{N + ^){N 



o 1 ^ 

^ lim - y /(x + <5n 6 



(3.5) 



Analogues of (3.4) and (3.5) might also be given for the centered Gram polynomials x i— )• 
tn{x - N + 2, 2iV) on 2N lattice points. 



Legendre polynomials P„ (use (2.10)): 

/(")(x) 
In particular, for n = 1: 



> + !)! 1 
— — lim — 



/(x + 50 ^n(e) 



lim J [' f{x + 50^d^. 



(3.6) 



(3.7) 



• Shifted Legendre polynomials x i— > Pn{2x — 1) (use ( 2.1l| ): 

^ (2n + l)! 1 ^ ^ ^ _ 
n! 54,0 0" Jo 

3.1 Cioranescu's 1938 paper 



(3.8) 



A variant of Theorem 3.1 was first stated and proved by Cioranescu |lll formula (M')] in 1938 
for the case that dfi{x) = w{x) dx is absolutely continuous with bounded support within an 
interval [a, 6]. He showed for / G C"'([a, 6]) that there exists r/ G (a, h) such that 



^, Lfiy)Pn{y)w{y)dy ^ 
iay'^Pn{y)w{y) dy 



(3.9) 



Then he took limits for 5 J, a in the left-hand side of (3.9) (see |11^ formula (9)]) with p„ 
remaining an orthogonal polynomial on the shrinking interval [a, h] with respect to the weight 
function w restricted to [a,h]. The limit on the right-hand side of (3.9) then becomes f^^\a). 
In general, this limit formula for f^'^\a) will not be contained in (3.1) since the weight function 
(after rescaling it to a fixed interval) will not remain the same during the limit process. But 
Cioranescu's limit result in the case of shifted Legendre polynomials is the same as (3.8). The 
case n = 1 of (3.8) is explicitly mentioned by Cioranescu (see [HI formula (9')]). 
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3.2 Substitution of the Rodrigues formula 

For classical orthogonal polynomials Pn and for / G C"{I) with /("^ of polynomial growth on / 



we can also prove (3.1) by substituting dfi{x) = w{x) and the Rodrigues formula (2.8) together 

irts n times: 



with (2.9), and by performing integration by parts n times: 



kr,,n\ 



/("^(a;) as 5 10. (3.10) 



In the Legendre case w{x) = 1 this was already observed by Cioranescu [11^ p. 296]. 
3.3 Haslam-Jones' 1953 paper 



Next Theorem 3.1 for the case that n has bounded support, was observed (with proof omitted as 
being easy) in 1953 by Haslam-Jones \29\ p. 192], who was apparently not aware of Cioranescu's 
result. In fact, in his formulation the measure /u only has to be real, not necessarily positive. 



Furthermore, / only has to be continuous with an n-th Peano derivative at x (see (2.21 



Note that our proof of Theorem 3.1 can be used without essential changes under the weaker 
hypotheses of Haslam-Jones. 

In fact, the assumptions in [29j are still weaker. Haslam-Jones assumes, for given n > 0, a 
real, not necessarily positive measure v with bounded support (or equivalently a function of 
bounded variation) on a finite interval J such that fj x^ dv{x) = for /c = 0, 1, . . . , n — 1 and 
jj x" dv{x) = K ^ 0. Then for a function / which is continuous on a neighbourhood of x and 
has n-th Peano derivative Cn in x we have 



ni 



1 



hm- / f{x + 60du{0. 

K (54.0 o" J J 



(3.11) 



Again this can be proved as we did for Theorem 3.1, without essential changes. 



3.4 A special case of Haslam-Jones' results 



We will consider here a special case of (3.11 ) which is essentially different from Theorem 3.1 Let 



{Pm} be a system of orthogonal polynomials on [—1, 1] with respect to a positive Borel measure /U. 



Let Kfn{x, y) be the corresponding Christoffel-Darboux kernel given by (2.14), (2.15). Fix n and 



define the measure u in (3.11) by 



(3.12) 



Indeed, by the reproducing kernel property the right-hand side of (3.12) equals if / is a 



polynomial of degree < n, while for /(^) := the right-hand side of (3.12) becomes 



c-\^-l)Kn-l{^^)d^Ji{i) 



fcn-lPn(l) 



-1 



k„ 
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Thus for this case (3.11) becomes 



n\kn ,. 1 
— 7— nm — 

Pn{l) SIO 5" 



-1 



(3.13) 



In particular, take dfi^S,) := d(, on [—1,1]. Then substitute (2.19), by which (3.13) takes the 
form 

1 



1" i' 1 ' 



lim ^ 



m 



I fix + SO (p;(e) + p^i(o) de) . (3.14) 



For this case Haslam-Jones (29| showed that if the hmit on the right of (3.14) exists then the 



n-th Peano derivative of / at a: exists and it equals Cn given by (3.14). A different proof of this 
result was given by Gordon [21]. 



3.5 Connection with Jacobi type orthogonal polynomials 



For a larger family of examples than (3.14) consider formula (3.12) with dfi{x) := (1 — x)°(l + 
xf+'^dx (a,/3 > -1). Then p^{x) = P, 
(4.5.3)] we obtain that 



(x), a Jacobi polynomial (see ^64* Ch. 4]). From 



r(n + Q + /3 + 2) 



Kn-i(x, 1) 2«+/3+2r(a + l)r''(n + /3 + 1) ^""^ 



(x). 



Then the vanishing of the right-hand side of ( 3.12[ ) for polynomials / of degree < n can be 
written more explicitly as 



r(a + /3 + 2) 



2a+/3+ir(a + l)r(/3 + 1) 



(1 + -)^-! 



2P 



n-1 



(1) 



:i-x)"(l + x)^dx 

(/3 + l)„.(n-l)! 
" (a + /3 + 2)„(Q + 2)„_i 



/(I) = 0. 



Hence, for fixed n, the polynomial (l + x)P^°^^''''^"'^^(x) is the n-th degree orthogonal polynomial 
with respect to a measure on [—1, 1] consisting of the weight function (1 — x)"(l + x)^ and a 
negative constant times a delta weight at x = 1. On comparing with [371 Theorem 3.1] (extended 
to negative multiples of delta weights by analytic continuation) we can identify this orthogonal 
polynomial with a Jacobi type polynomial: 



p(-'^-'''^\x) = c{l + x)Pf, 



where 



(/3 + l)„(n-l)! 
(a + /3 + 2)„(a + 2)„_i 



and 



P("'^^°'^^(l) 
2Pt',''^'-'\l) 



a + 1 
2n 
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This corresponds correctly with [37' (2.1)], which simplifies for M = and the above choice of 
to 

n(n + a + p + 1) ax 
The above formulas extend by continuity to the case /3 = — 1, which occurs if dfi{x) = dx (the 



case considered in (3.14)) 



3.6 Lanczos' 1956 work 

In a book published in 1956 Lanczos |42| (5-9.1)], apparently unaware of the earlier work by 
Cioranescu [TT] and Haslam-Jones rediscovered formula (3.7). He called this differentiation 



by integration. His work got quite a lot of citations, see for instance [25], |60], [32], [5l], [8], 
[S], [66j. The name Lanczos derivative, notated as /^(x), became common for a value obtained 



from the right-hand side of (3.7). In [54j and ^ also (3.6) (the Legendre case for general n) was 
rediscovered. 

As an important new aspect Lanczos observed that 



3 

25 



f{x + 5C)idi = 

1 



as an approximation of /' (x) , is the limit as — )• oo of the quotient of Riemann sums 



2N{N +h{N + 1) 6 



N ^ 



:=-N 



f{x + N-'50C- 



We have seen this last expression already in (3.5 ) (the special case of (3.1 ) with a centered Gram 
polynomial of degree 1). The expression approximates f'{x) for N~^6 small. In fact, (3.5) was 
the starting point of Lanczos, see [121 (5-8.4)]. 



3.7 Interpretation by least-square approximation 



Lanczos \42\ (5-8.4)] arrived at (3.5) by a least-square minimization (a technique invented by 
Gauss and Legendre, see for instance |16j). For this compare the function ^ i— ?• f(x + 5^) with a 
linear function g{^) := oq + oi^ such that the squared distance 



N 



S{ao,ai):= ^ (/(^ + 5^ - 5(0)^ 



-N 



E {fix + v)-{ao + 6-'a^v)y 

-5N-5(N-1),...,SN 



is minimal. Then the slope d^^ai of the straight line 1— )• oq + 6~^airi minimizing the distance 
will approximate f'{x) for small 6. The minimum is achieved for a unique (ao,ai), where one 
finds oi in this simple case already by solving 7^5(00,01) = 0. Thus Lanczos obtained 



N 



N 



-N 



25N{N +1){N + 1) 



-N 
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and he thus arrived at (3.5). 



We can interpret (3.1) as a more general least-square approximation. By the assumptions 
on / in the Theorem 3.1 the function ^ i— )• f(x + ^5) is in L^(]R, /u) for each 6 £ [0,e]. Let 
^ I— 7- Pn,x.s[f]iO be the polynomial of degree < n which is on minimal distance from the function 
f{x + CS) in the Hilbert space L^{R, /i). Then 



" 1 r 

k=0 ^ -^^ 



(3.15) 



Then 



n-n JR rin 



approximates f^^\x) as 5 | 0. Thus we arrive at (3.1). 
Also observe that clearly 



fix + 50PniOMO= / PnMm)PniOdf,iO 



Hence, we can rewrite (3.1) as 



f(")i 



X] 



knTll 1 

— — hm — 



(3.16) 



For the Legendre case (3.6) this interpretation by least-square approximation was given in [8j. 



But earlier, in 1990, Kopel & Schramm [H], apparently unaware of any predecessors, arrived at 
the case n = 1 of (3.8) while they were guided by least-square approximation. 



3.8 Even orthogonality measures 



We can refine Theorem |3.1| if we assume that the orthogonality measure fi considered there is 
even, i.e., that dfi{—x) = d^{x). Then the corresponding orthogonal polynomials pn are even 
or odd according to whether n is even or odd, respectively. The simplest example is give n by 
the Legendre polynomials. Now also modify the assumptions about (x) in Theorem 3.1 
Only assume that the right n-th derivative f^\x) and left n-th derivative f^\x) exist. Then 



by an easy adaptation of the proof of Theorem 3.1 we get (3.1) with on the left-hand side the 
symmetric n-th derivative of / at x: 



(n). 



knnl 

hn 



lim — 



f{x + 60PniOdm- 



(3.17) 



The special case of this result for Legendre polynomials (see (3.6), (3.7)) was observed in |^5l 
Proposition 1] for n = 1 and in [8^, Theorem 2] for general n. The special case of (3.17) for 



centered Gram polynomials (see (3.4)) was observed in [9, Theorem 1.1]. 
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Moreover, in [HI pp. 370-371] and [Ml §4] examples were given for the Legendre case with 
n = 1, where the hmit on the right-hand side of (3.7) exists, but the left and right derivative of 
/ at X do not exist. Earher, in \^T[ pp. 231-232] an example was given where the limit of the 
right-hand side of (3.8) for n = 1 exists, while the right derivative of / at x does not exist. 

Consider the proof of Theorem 3.1 if we know that f'^^\x) exists for k up to some m > n. 
Then 



1 



/<»)(.)+ 1 -^'"(^'*"" 

k=m+l 



kr,.n\ 



kl 



hr, 



kr,nl 



/(")(x) + 0((5) as(5|0. 



We can say more if moreover /i is an even measure. Then J^S,""^^ PniO dfJ-{£,) = and thus we 
have, as (5 I 0, 



1 



fix + SOPniOdm 



kr,.n\ 



o{5) if /("+^) (x) exists, 
0{5'^) if /("+2)(x) exists. 



(3.18) 



The Legendre case of (3.18) was observed for n = 1 in [42^ (5-9.3)] and for general n in [54^ (4)]. 
3.9 Generalized Taylor series 

kr,n\ , 1 



Rewrite (3.1 ) as 



/ 



in). 



lim 



f{x + C)pni5-'Odl^5iO- 



(3.19) 



Here ^is{E) := fJ,{6~^E). Then the polynomials x i— )• p„((5^^x) are orthogonal with respect to 
the measure /x^. The formal Taylor series 



^/W(^ 



(3.20) 



n=0 



can accordingly be seen as a termwise limit of the formal generalized Fourier series 



E/rf/ fi^ + OPn{S-'C)d^is{C))pn{v) 



n=0 



1 f kr,n\ 



f{x + OPn{S-'0 df^siO 



5^Pn{6-^n) 



(3.21) 



Indeed, use (|3.19[) and the limit (5"pn(5 ^r])/kn — )• as 5 | 0. 



While for a big class of orthogonal polynomials and for / moderately smooth, the series 



(3.21) converges with sum /(x -|- ry) because of equiconvergence theorems given in Szego [& 
Ch. 9 and 13], we will need analyticity of / in a neighbourhood of x for convergence of (3.20) to 
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f{x + rj). In the case of Jacobi polynomials and for / analytic on a neighbourhood of x we can 
use Szego [64] Theorem 9.1.1]. Then an open disk around x of radius less than the convergence 
radius of (3.20) is in the interior of the ellipse of convergence (3.21) for 5 small enough. This 
was discussed for Legendre polynomials by Fishback |20j . 

A different limit from orthogonal polynomials to monomials is discussed by Askey & Haimo [1] . 
This involves Gegenbauer polynomials, which we write as Jacobi polynomials Pn'^'"^(x) = 



UV^'UL) n _|_ . . . . 
f^n X -f 



lim 



{a, a) 



(x) 



z,(°.") 



(3.22) 



Consider now the formal expansion of /(x + ij) for ij G [—1,1] in terms of the polynomials 



(a, a) 



(ry): 



E 

n=0 

oo 

E 

n=0 



ni 



-1 



n! 



2\n+a 



where we used the identitity for 5 = 1 in (3.10). The last form of the above formal series tends 
termwise to the formal Taylor series (3.20). This is seen from (3.22) and the fact that the 
measure (1 -,^^)'^+° d^/ /^^(l -^^)""'"" tends to the delta measure as a — )• oo (see [H p. 301]). 
As observed in |4t p. 303], the function / has to be increasingly smooth as a grows in order to 

have convergence in the expansion of /(x + rf) in terms of P^'°'\ri). 



3.10 Connection with the continuous wavelet transform 

The continuous wavelet transform (see for instance [H], |38| ) is defined by 

{^gf){a,h) := laj-i / f{t)g{a-\t-b))dt (/ G L^R), a,6 G M, a / 0). (3.23) 

Here we will take the wavelet g as a nonzero function in {L^ n L^)(]R) such that f^g{t) dt = 0. 

For orthogonal polynomials Pn{x) let the orthogonality measure have the form dfi{x) = 
w{x) dx (essentially item 1 in { 2.1 ) with w{x) > for x G M and with the functions x i— )• x" w{x) 
in (L^ n L2)(M) for all n = 1, 2, . . . . Put 



gn{x) := pn{x)w{x). 



(3.24) 



Then, for n = 1, 2, . . . the functions gn are in (L^ n L^)(M) and satisfy j^gn{t) dt = 0. Now 
consider the continuous wavelet transform ( 3.23| ) for g equal to such g^ and compare with (3.1). 
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Then 



f{x + ^6)pn{0w{0dC = S-' [ f{t)gn{6-Ht-x))dt = 6-'H^gJ){6,x). (3.25) 



Hence, (3.1) can now be written as 



(3.26) 



A similar observation about the continuous wavelet transform approximating the n-th deriva- 
tive was made by Rieder |55l (5)] in the case of (3.23) with g having its first n moments equal 
to zero. In fact, he recovers formula (3.11), first obtained by Haslam-Jones [29], for v absolutely 
continuous, /, c„ being the n-th distributional derivative of /, and the limit taken in a suitable 
Sobolev norm (see also j55i Theorem 2.3]). An example of a wavelet g having its first n moments 
equal to zero is Daubechies' wavelet ^■0 of compact support, see [13, pS 



The continuous wavelet transform ^g^^ with gn given by (3.24) and w{x) being a weight func- 
tion for one of the classical orthogonal polynomials (Jacobi, Laguerre, Hermite) was considered 
by Moncayo & Yahez ^48j. 

3.11 A special case: the n-th order finite difference as approximation of the 
n-th derivative 



For iV = n -|- 1 we can see that (3.3) specializes as an n-th order finite difference approximating 
the n-th derivative. Indeed, write (3.3) as 

/W(x)=lim(Z)„,5,^/)(x), 



.54,0 



where (see also (2.4) and [361 (9.5.1)]) 



(2n + l)!(Ar-l)!(-l)' 
n! {N + n)\5'' 



N-l 



5=0 



-n,n+ 1, 



-N + l 



Now put iV := n -|- 1 and use that for ^ = 0, 1, . . . , n we have by [5_0, (15.2.4)]: 

^ (-Qfc (n + l)fc ^ 



3F.( -';^ + l'-^;l 

1, — n 



A;=0 



k\k\ 



1 



n)^ 



(-1)« 



Hence 



where 



fix + 6) -fix) 
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4 Filters for higher derivatives 



The following theorem is a multi-term variant of Theorem 3.1 



Theorem 4.1. Let {^'A;}fc=o,i,2,... be a system of orthogonal polynomials with respect to the or- 
thogonality measure ^. Let m,n be integers such that < m < n. Let x G M. Let L be a closed 
interval such that, for some £>0,x + 6S,£LifO<5<e and ^ G supp(^). Let f be a 
continuous function on I such that its derivatives of order 1,2, ... ,n at x exist. In addition, if 
I is unbounded, assume that f is of at most polynomial growth on I. Then 



as (5 J, 0. 



(4.1) 



If f G C^{I) and /^"^ is of polynomial growth on I then (4.1) holds uniformly for x in compact 
subsets of I. 



Note that (4.1) turns down to (3.1) if m = n. 



Proof With the notation (2.14|) we can rewrite the left-hand side of (|4.1|) as 



m 

dr] 



fix + 5()Kn{C,rj)df,iO 



ri=0 



(4.2) 



By substitution of (2.22) and by application of (2.17) the expression (4.2) is equal to 



n-m r(m+l)(^\ 



1=0 



V. 



r]=0 



drj 



rF,,„(<^e)K„(e,??) dfiio 



ri=0 
5 xm 



dr] 



ri=0 



Now use the same argument as at the end of the proof of Theorem |3.1[ calling Proposition 2.3 
and using dominated convergence, in order to show that 



hm [ eF.,nm 

Jr 



drj 



r]=0 



For the proof of the last statement also use Proposition 2.4 



Remark 4.2. In view of (3.15) we can rewrite (4.1) as 
1 



/(-)(x) 



Sm drj-^Pr^,.AfM ^^^ + o{d^~n as no (m = 0, 1, . . . , n). 



□ 



(4.3) 



If / is a polynomial of degree < n then (4.3) holds exactly without the term o{6"' ™) because 

Pn,xAfM = f{x + Sr,). 
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For / having derivatives at x up to order n + 1 we can refine (4.1) as follows. 



Proposition 4.3. Keep the assumptions of Theorem 4-1- Moreover assume that (x) 
exists. Then 



j=m ^ 



'f ^5«-™+i + o(5-™+i) as 5 10. (4.4) 

kn+i [n + iy. 



If f G C'^~^^{I) and /("+-'^) is of polynomial growth on I then (4.4) holds uniformly for x in 
compact subsets of L 



Proof Write the left-hand side of (4.4) as 



n + 1 and apply (2.17). Then the expression (4.2) becomes 



4.2^. Then substitute (2.22) with n replaced by 



f{n+l)( \ 
f("^)(^\ _|_ 1 xn-m+l 



d_ 
drj 



+ S 



[ r+'K„(e,r?)dMe) 

JR 



srn—m+l 



By a similar argument as in the proof of Theorem 4.1 we see that the last term equals o{6 



rn— rrl+l^ 



as (5 I 0. By application of (2.17) the second term becomes 



/ 8 

(n + 1)! \dr] 



Pn+liO Pn+l{r]) 



h 



n+1 



dm 



ri=0 



hn+1 (n + 1)! 



in— m+1 



r+Vn+i(e)d/"(e). 



Then (4.4) follows by using (2.7). 



□ 



4.1 Even orthogonality measures 

If in Proposition 



4.3 



the orthogonality measure n is even and if n — m is odd then pj™^ (0) = 
whenever j — m \s odd, so the sum on the left-hand side of (4.4) then runs over j = m,m + 
2, . . . , n — 1. Moreover, for p^'^\(0)/fen+i occurring on the right-hand side of (4.4) we then have 
that 



K+l 



(4.5) 



and for x large, p^i{x) > 0. Also note that p^^i is a polynomial of degree n + 1 — j which is 
even or odd according to whether n + 1 — j is even or odd. Since Pn+i belongs to a family of 



In order to prove (4.5) we can assume without loss of generality that > 0. Then, for all j 



(j) 
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orthogonal polynomials, it has n + 1 simple real zeros. The number of positive zeros is (n + 1) /2 
if n + 1 is even and n/2 if n + 1 is odd. Now it follows by induction with respect to j that p^+i 
has (n + 1 — j) /2 positive zeros if n + 1 — j is even and (n — j)/2 if n + 1 — j is odd. So we arrive 
also at this property for j = m, and then (4.5) readily follows. 

Thus, for fi an even measure and n — m an odd number we can rewrite (4.4) as follows for 
(510. 



1 



E 

j=m,m+2,...,n—l 



p) '(0) 



^_-|^j(n-rra+l 



)/2 



\kn+i\ (n + 1)! 



m+l 



^0(5"-""+^) 



if (x) exists, 

if exists, 
if exists. 



(4.6) 



This is proved by a slight adaptation of the proof of Proposition 4.3 Furthermore, (4.6) holds 



uniformly for x in compact subsets of / if the appropriate derivative of / of order n + 1, n + 2 
or n + 3 is continuous and of polynomial growth on /. 



Remark 4.4. Note that for fixed m the approximation of the left-hand side of (4.6) (and 
earlier formulas (4.1), (4.3) and (4.4)) to f^"^\x) becomes better as n increases. However, this 



observation disregards the frequency spectrum of the signal / and the effect of noise. See Remark 



4.5 for a discussion of these aspects. 



4.2 Filters 



We can consider the left-hand side of (4.1) as a filter (continuous or discrete depending on the 
choice of /i) for m-th order differentiation at x. In general, a continuous respectively analog 
filter sends an input function / to an output function g by convolution with a fixed real-valued 
function p: 



N 

9{y) = ^ f{y - x) p{x) (yez), 

x=M 
i-N 

g{y)=l f{y-x)p{x)dx (yeM). 

JM 



(4.7) 
(4.8) 



Here M may be — oo and N may be oo. Filters are widely used in electrical engineering, with 
analog filters being continuous and digital filters being discrete. There y is usually the time 
variable t and instead of p one writes /i, the unit impulse response. If M and are finite 



in (4.12), one speaks about a finite impulse response (FIR) filter, otherwise about an infinite 
impulse response (IIR) filter. The Fourier transform (/> of p is called the characteristic function 
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or transfer function (also denoted by H) of the filter: 



N 



[uj] := > p(x] e 

x=-M 
(■N 

{io) := / p{x) e"^^^ dx. 



(4.9) 
(4.10) 



Equivalently, (/> equals the quotient g/ f oi the input function g and the output function / if 



f{y) ■•- 



Standard books of digital filter theory are for instance |2j and [2?]. In ^ p. 306] 



methods are described for the design of digital differentiators. 



4.3 The characteristic function 



We continue with the left-hand side of (4.1) considered as a filter. Let us assume that /i is an 
even measure and that n — m is odd, so that we can work with (4.6). We obtain the characteristic 
function (j) of the filter defined by the left-hand side of (4.6) if we put there f{^) := e*"^^ with 
and take x 



1 



0: 

E 

j=m,m+2,...,n—l 



with G a bounded function on 
/ bounded). 



IpSUo)I 



P) (0) 



\kn+i\ (n + 1)! 



,n— m+l 



+ {5uj 



,n— m+3 , 



GiSoj)), (4.11) 



(for the proof of the second equality use Proposition 2.3 with 



Remark 4.5. From the last part of (4.11) we see that for differentiation with fixed order m 
the first term gives the characteristic function {iuj)^ of the ideal differentiator. The second 
term has degree n -|- 1 in w and gives rise to a falling down of the characteristic function, since 
the coefficient has negative sign. So for high frequencies the filter will be a low pass filter and 
for low frequencies the filter works well for differentiat ion. Increase of n brings the filter in a 

for approximation to f^"^\x) in the 



4.4 



sense closer to the ideal differentiator (see also Remark 
x-domain), but the pass band will also increase, causing more high frequency noise (see section 
5.2 and Figure [l] for an example in the Legendre case). In the practice of the construction of a 
differentiating filter one has to decide to what frequency the differentiation must do the job and 
how much noise one accepts. This all depends on the frequency contents of the signal and the 
noise. See also the discussion for the case of constant weights by Barak [5, p. 2761] (for m = 0) 
and by Luo et al. |44l §5] (for general m). 



4.4 Smoothing filters 



For m = the filters given by the left-hand sides of (4.1) and (4.6) are examples of smoothing 



filters. These have a very long history, see Schoenberg [58], |59| and references given there, which 
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= -N in ( [47| ) and M 

{y G Z), 



-iV 



— 1 in 

(4.12) 
(4.13) 



go back as far as De Forest's work in 1878. We put M 

N 

9{y) = ^ fiy- x)p{x) 

x=-N 

g{y) = J f{y-x)p{x)dx 

and we usually take p symmetric: p{x) = p{—x). 

We say that (4.12) or (4.13) is exact for the degree j (where j < 2N in case of (4.12)) if 
g = f whenever / is a polynomial of degree < j, but g ^ f for some polynomial / of degree 
j + 1. Because of symmetry of p, such j will always be odd. 

Exactness of (4.12) for degree at least 2n + 1 < 2N can equivalently be stated as 

N 

p{x) = Y,Ck t2k{x + N,2N + 1) {x e {-A^, -N + 1,...,N}) 

with 



k=0 



Ck = t2k{N, 2N + l)/h2k if A; = 0, 1, . . . , n, 
where y i— t- t2k{y, 2A^+ 1) is a Gram polynomial (see (|2.4[)) and h2k is the corresponding constant 



given by ( |2.6[ ). For such p the sum of squares '^x=-N Pi^)^ minimal if and only if Cfc = for 
n<k< N. Then 



p{x) = K2„(a; + N, N) (x G {-N, -N+l,..., N}), 



(4.14) 



where K.2n is the Christoffel-Darboux kernel of degree 2n (see (2.14)) for the orthogonal poly- 
nomials y i-> tk{y, 2N + 1). 

Similarly, in case of (4.13) and assuming that /? is a polynomial, the requirements that the 
formula is exact for the degree 2n + 1 and that f^-^ p{x)^ dx is minimal are equivalent to 

/5(x) = K2„(x,0) (xG[-l,l]), (4.15) 

where K2n is the Christoffel-Darboux kernel of degree 2n for the Legendre polynomials Pk- 
More generally than (4.14) we can work with orthogonal polynomials p„, satisfying (2.2) or 



(2.3) for equidistant points running over a symmetric set and with symmetric weights: 

N 

Pni{x) pn{x)w{x) = Q (m ^ n), 



-N 



where w{x) = w{—x). In terms of these polynomials pn exactness of (4.12) for degree at least 
2n -|- 1 < 2N can equivalently be stated as 



N 



p{x) = CkP2k{x) w{x) {x E {-A^, -N + l,...,N]) 



(4.16) 



fc=0 
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with 



In particular, the choice 

p{x) := ^2n{x,f))w{x 



Cfe = P2fc(0)//i2fc if A; = 0, 1, . . . ,n. 

fc2nP2n(0) P2n+l{x)w{x) 



k2n+ih: 



2n 



(4.17) 



(4.18) 



with K2ra the Christoffel-Darboux kernel and with (2.15) used for the second equality, will make 



(4.12) exact for the degree 2n + 1. 



respectively (4.10) with M 



The characteristic function for (4.12) respectively (4.13) is defined by (4.9) with M = —N 

—N = —1. The condition that (4.12) or (4.13) is exact for the 



degree 2n + 1 is equivalent with the condition that has power series of the form 

(A(w) = 1 



(4.19) 



with a ^ 0. 

An m-fold iteration of (4.12) (with N = oo for convenience) yields 

oo 

9m{y)= ^ f{y-x)pm{x) (y G Z), 

x=—oo 

where pm = . . . *p is an m-fold convolution product. De Forest (1878) raised the question for 
which choices of p the asymptotic behaviour of Pm{x) for large m can be described. Schoenberg 
|58t Theorem 1 and Remark 1 on p. 358] showed that this is possible precisely if the characteristic 
function satisfies 

\(t){uj)\<l for0<w<27r. (4.20) 



If (4.20) is satisfied then the smoothing is called stable. Clearly (4.20) will imply that (4.19) 



holds with a > 0. For p given by (4.18) we see from (4.11) that a > is satisfied for any choice 



of the weights and that a is explicitly given by 

\p2n+2 



\k2n+2\ (2n + 2)! 

The stability condition ( 4.20| ) can also be considered for the continuous smoothing formula 



(4.13), where now lo € K\{0} in (4.20). For the Legendre case where p is given by (4.15) 
stability was proved by Trench [65j and Lorch & Szego 



4.5 Fourier-Bessel functions 



As a common generalization of (4.12) and (4.13) with p given by (4.15) and (4.18), respectively, 



we can consider a smoothing formula 

9{v)= I f{y - x)r{x)dp{x) 



(4.21) 
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with n an even positive orthogonality measure for the orthogonal polynomials pn and with r 
given by 

P2ji0)p2jix) _ k2nP2niO) P2n+l{x) 



ix) :=K2n(a;,0) = ^ 

j=0 



h 



2j 



2n 



X 



(4.22) 



(Such usage of the Christoffel-Darboux formula was emphasized in [53J for the cases of Gram 
and Legendre polynomials.) Then we can define the corresponding characteristic function by 



Thus 



[ r(x) e"^^"^ 


dfi{x) 


y-P2,(0) 1 


P2j{x) e"'^'^ 


k2nP2niO) 1 


' P2n+l(x) ^_ 


k2n+lh2n J I 


? X 


k2nP2n{0) j 
ik2n+lh2n J 


P2n+l{x) e~ 

R 



(4.23) 



' dfi{x) 



(4.24) 



In the Legendre case dp,[x) = dx with support [—1,1] we can evaluate integrals occurring above 
in terms of (spherical) Bessel functions as follows: 



Pn{x) e-''''^ dx = r 



-1 




J„,i(a;) = 2z-"i4a;), 



(4.25) 



see ISni (18.17.19), (10.47.3)] or ^ (4)]. In the Chebyshev case dfi{x) = (1 - x'^)"'^/'^ dx, 
Tn{cos9) := cos(n^) we similarly obtain (see [501 (10-9-2)]) 



TT-i / r„(x)e-"'^dx = i~" J„(a;). 



Formula (4.26) was the reason for Mantica |45j . |46j to call the functions 

Jnioj;p) := [ pn{x) e-'""^ dfi{x) 



(4.26) 



(4.27) 



Fourier-Bessel functions (where he took pn orthonormal and /i a probability measure). The 
same functions occur in Ignjatovic [34J as the functions /C"[m] (in the notation of |34^ §2.1]). 



Formula (4.25 ) played an important role in the proof of the stability result in the Legendre case, 



see [13]. It also occurred in Rangarajan et al. [54, (14), (15)] for a formal operational calculus 



in connection with the right-hand side of (3.6) before taking limits. In the Appendix we will 



compute the Fourier-Bessel functions for the case of the shifted symmetric Hahn polynomials 
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4.6 Stability of smoothing in case of symmetric Hahn and Krawtchouk poly- 
nomials 



The shifted symmetric Hahn polynomials 

Pnix) := Qn{N + x;a,a,2N) (n = 0, 1, . . . , 2iV) (4.28) 
are orthogonal polynomials on {—N,N + 1, • • • , N} with respect to the symmetric weights 

(4.29) 



(g + l)Af+^ (q + 1)n-x 
{N + x)l {N-x)\ ' 



see \36\ (9.5.1) and (9.5.2)]. Assume that a is a nonnegative integer. Consider in terms of these 
polynomials pn formulas ( |4.16 ) and (4.17) characterizing exactness for degree at least 2n + 1. 
Now observe that, by [361 (9-5.9)], we have 



(2iV + l) 



a 



Qn+a{N + X + a; 0, 0, 2iV + a), 



where A^(/(x)) = (A/)(x) := f{x + 1) - f{x). It follows that 

N 



(4.30) 



x=—N—a 



is minimal for p given by (4.16) and (4.17) if and only if Ck = for n < k < N . Greville [24, §3] 
(1966) denotes (4.30) by -R^ (after division by i^^))- Therefore he calls the smoothing formula 
(4.12) the minimum formula if p is taken such that (4.30) is minimal. Greville |24| (4.2)] 
gives an explicit formula for the characteristic function (j) in case of a minimum formula. 
He ascribes this formula to Sheppard |61j (1913). We will derive this formula in the Appendix. 
Greville 



^5] next proves the stability property (4.20) for these cases. Curiously, Greville 
does not mention Hahn polynomials in any way. Hahn polynomials in this context seem to come 
up first in Bromba & Ziegler |7j §3.2]. 

As a special case of j36| (9.5.16)] there is the limit formula 



lim Qn{x + N- a, a, 2A^) = Kn{x + N-\,2N)= 2F1 



-n, —N — X 
-2N 



;2 



(4.31) 



where the polynomials x 1— )• Kn{x;p, N) are Krawtchouk polynomials (see [361 §9.11]). The 
corresponding weights (4.29), suitably normalized, tend for a — )• 00 to the symmetric weights 

{x = -N, -N + l,...,N), 



2N 

N + x 



with respect to which the polynomials Pn{x) = Kn{x + N; 2A^) are orthogonal. The smoothing 
formula with p given by (4.18) for this pn and w is called the minimum Roo formula by Greville 
^6]. He obtains the characteristic function (f) for this case explicitly as a limit case of his 
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formula in the minimum Ra case (see also (A.9)), not working with Krawtchouk polynomials 
at all. (Krawtchouk polynomials seem to come up first in this context in Bromba & Ziegler [3 
§3.3].) But Greville also obtains in some way that, 

0W(7r) = O {k = 0,l,...,2N -2n-l). 

Let us prove this by observing from ( 4.31| ) that 

N+x 



(4.32) 



P2N[X) 



2FI 



-2N, -N -X 
-2N 



j=0 



N + x 
3 



i-^y = (-1) 



N+x 



Hence 



N 



N 



^('\7r)= ^ p(a:){-ixf{-l) 



K2nix,0)x'' P2Nix)Wx = 



x=-N 



x=-N 



for k < 2N — 2n by orthogonality. Now Greville concludes from ( |4.19[ ) and (4.32) that 

(t){uj) = 1 - (sin2(w/2))"+ip(sin2(w/2)) = {cos^{u/2))^-''Q{sm^{u/2)) (4.33) 

for certain polynomials P of degree — n — 1 and Q of degree n. From that he immediately 
derives that Q{z) is equal to the power series of (1 — z)"^"*"" truncated after the term with z", 
i.e., 



Q{z) 



t^.^>0 (0<z<l). 



fc=o 



k\ 



By a similar argument we see that 



p{z)= y: ^^^(i-^)^>o 



fc=0 



k\ 



(0 < z < 1). 



(4.34) 



(4.35) 



Hence, by (4.33), 



O<0(a;)<l (0<tJ<27r), 



which is even stronger than the stability condition (4.20). 

Identity (4.33) with Q and P given by (4.34), (4.35) has a long history which is surveyed in 
Koornwinder & Schlosser [39j. However, this paper missed Greville's paper and the connection 
with Krawtchouk polynomials. A sequel |l0] to [39], tracing ( 4.33 )-( 4.35) back to 1713, is in 
preparation. 



By a short chain of identities, using (4.33) and (4.34), 4>{ijj) can be expressed as 
(j){uj) = {N -n) 



cos2(w/2) 



,Af-n-l 



(1 -s)"£is. 



see [33 top of p. 249]. Hence (/> is monotonically decreasing on [0, vr] from 1 to 0. Such filters 
without ripples are called maximally flat by Herrmann [31j (see also Samadi and Nishihara 
|56j). Herrmann gave the same argument as above for solving (4.33), apparently unaware of 
Greville 
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4.7 The Savitzky-Golay paper and its follow-up 



The first instance of an approximation of first and higher derivatives by formula (4.1) with 
possibly n > m was given by Savitzky & Golay [1^ in 1964. They only dealt with the case 
of constant weights on {— A'^, — + 1, . . . , A^}, they used only very special A^, n and m, and 
they did not explicitly mention or use the corresponding orthogonal polynomials. They were 
motivated by applications in spectroscopy. Their paper had an enormous impact, for instance 
5432 citations in Google Scholar in January 2012. Some corrections to |57j were given by Steinier 
et al. [62J in 1972. 

Probably, Gorry [22j (1990) was the first who gave (4.1 ) in a more structural form in the case 
of constant weights on an equidistant set using centered Gram polynomials. Next, in |23J (1991) 
he considered (4.1) on a finite non-equidistant set, still with constant weights. Meer & Weiss 
|47j (1992) gave (4.1 ) for orthogonal polynomials on a set {— A^, — A^ + 1, . . . , A^} with respect to 
general weights. They made this more explicit in the cases of centered Gram polynomials and 
of centered Krawtchouk polynomials with symmetric weights. 

We recommend Luo et al. [44j as a relatively recent survey of follow-up to the Savitzky-Golay 
paper. 



5 Filter properties in the frequency domain: some examples 

In section |4] many so-called Zinear filters for derivatives were mentioned. In electrical engineering 
one uses the transfer function H for understanding the properties of the filter. This is the Fourier 



transform of the unit impulse response of the filter, see (4.9), (4.10) where we wrote (f> instead 
of H. In general, this function is complex- valued. We can show the properties of the filter in 
the frequency domain by a log-log plot of the modulus of the transfer function (which may be 
complemented by a phase plot). In general, for a differentiation filter of order n, H{uj) should 
behave for low frequencies like w", and for high frequencies like a constant (equal to zero in the 
ideal case). When the behaviour is different, the filter is called unstable. 

5.1 The Lanczos derivative 



The (analog) filter corresponding to the Lanczos derivative is given by (3.7) ignoring the limit: 

. ^ 3 



fj{x + midi. (5.1) 



The output function g can be considered as a continuous (i.e. unsampled) approximation of the 
first derivative of the input function /. The transfer function H{lo) is equal to the quotient of 
g and / with f{y) := e*"^^. A short computation gives 

H{lo) = ^ f e'^^^ idi = iLo — ^ ( s\n{6Lo) - 6uj cos(5oj)) (5.2) 
2o J_i [ouj)-^ \ ) 

= iw(l + 0{b^u?)) as bu: i 0, 
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compatible with (4.11) for m = 1, n = 2. For small uj5 we have arg(if(a;)) = tt/2. The modulus 
of H[uj) for (5 = 1 is given in Figure [T| case n = 1 as a log-log plot. 



5.2 Multi-term variant of the Lanczos derivative 



To get a better approximation we can use equation (4.1) with pn a Legendre polynomial: 

[(n-m)/2] „{m) 



Xm / J 



m+2k 



(0) 



k=0 



h 



m+2k J-1 



f{x + 60Pm+2k{0 dC. 



(5.3) 



Here 



r,(m) 
m+2k 



(0) 



^2)iTi-+k 



'm + 2k+ 



h 



m+2k 



k\ 



by [ISl 10.10(26), 10.9(19)] a nd (|2lol ). For m = n = 1 ([53]) reduces to For the transfer 

function we obtain (see also (4.11)): 



[(n-m)/2] 



l)'(5)m+fe("i + 2A; + i) 



fc=0 

[(n-m)/2] 

.2''™^ 



A:! 



e'^'^P^+2k{i)di 



fc=0 



k\ 



jm+2k{Suj) 



where the spherical Bessel funtions jm+2k entered by (4.25). An explicit formula for spherical 
Bessel functions is given in [5U| (10.49.2)]. In particular, 

1 



Ji(^) 



sm z — z cos z 



1 



(15 — 6z^) sin z — (15 — z'^)z cos z ) . 



Thus Hi^i{uj) is given by (5.2). After some computation we obtain 

15iuj (21 - 8{6ujf) sm{6u;) + {-216uj + (^w)^) cos{6uj) 



(5.4) 



The modulus of Hi^^(uj) for (5 = 1 is also given in figure [T| case n = 3. It is clear that for n = 3 
the plot stays close to a straight line until higher values of oj than for n = 1. 



5.3 First order Savitzky-Golay filter 

When the input signal of the filter is given by a vector of (for convenience) odd dimension 2A^ + 1 
obtained by sampling a function / on equidistant points x — N5, x — (N — 1)6, . . . ,x + N6, then 
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N 



we may use (3.4) ignoring the limit as a discrete filter for the n-th derivative of / at x. In 
particular, for n = 1, we can use (3.5): 

gix) = 
For the transfer function 



2N{N +1){N + 1)6 



- E fi- + ^ot 



N 



2N{N + \){N + l)6 



we obtain by straightforward computation that 



3i 



sm 



-2 (sui{N5uj) sin((iV + l)5uj) 



N 



N + 1 



(5.5) 



(5.6) 



2{2N + l)5 

Note that the phase shift is exactly it/2. 

The modulus of H[uj) for 5 = 1 and for = 1 and 2 is given in Figure [2j See |441 Figure 1] 
for similar pictures. 



5.4 Butterworth filter 

If one needs a filter that does differentiation for low frequencies very well and has a good 
suppression for high frequencies then there are better filters than the ones discussed in this 
paper until here. For example there are the so-called Tchebyshev, inverse Tchebyshev, Elliptic, 
Butterworth and Bessel filters. These filters all differentiate, but the choice of the most suitable 
filter depends on the properties one needs, for instance a constant phase response, a good 
amplitude response, less side-lobes etc. We mention here the so-called n-th order Butterworth 
filter. The square of the modulus of the transfer function of an n-th order analog Butterworth 
filter that differentiates with order m is given by 

Hm,n{^) = -,^1 , ,2n =^'"l-^o.»Ml' {n>m). (5.7) 
1 + (w/wo) 
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0.50 - 




0.05 - X 
0.02 - y 

0.05 0.10 0.50 1.00 

Figure 2: Modulus of transfer function for the first order discrete filter, = 1 and 2 

Here is the so-called cutoff frequency. It is at this frequency coq where the the asymptotics of 
the low frequency part and the high frequncy part of the transfer function meet. The factor w^"* 
is the square of the modulus of the transfer function {ioj)^ of the ideal m-th order differentiator. 

As an example see Figure [3] showing the transfer function of a seventh order Butterworth 
filter with ojq = 1 (see how the side lobes differ from those of Figure [2]) . 

1.00 r 




0.05 0.10 0.50 1.00 



Figure 3: Transfer function for seventh order Butterworth filter 



In (5.7) one has to make a choice of Hm,n{^) as follows: 



with \pn{iuj)\ = 1 + {uj /coq) 



2n 



such that pn is a polynomial of degree n with real coefficients for which all (possibly complex) 
roots have negative real part. Then (4.10) and (4.8) take the form 

i/o,n(^)=/ p{t)e~'^'dt, g{t)= f{t-T)p{r)dT. 
Jo Jo 

p{t) is called the impulse response of the filter. It follows that the output function g satisfies a 
differential equation with the input function / as inhomogeneous part: 

Pn{d/dt)g{t) = f{t). 
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For instance, for n = 1 we have 

piM = l + iw/wo, =woe-^°*, C0o^g'{t) + g{t) = f{t). 
For n = 2 we have 

P2{iuj) = 1 + 2^/2 (icj/wo) + = 2^/2 wo e-^^'^'^'o* sin(2-i/2^oi)- 

One can obtain the transfer function for the Butterworth filter in the digital case from 
-f^o,n(w) in the analog case by so-called frequency warping: replace uj by 2T~^ tan(a;r/2), where 
T is the length of the sampling interval. Then some linear combination of finitely many output 
values g{x),g{x — T), . . . will be equal to some linear combination of finitely many input values 
/(x), f{x — T), ... (a so-called recursive filter). 

There are important differences for practical applications between filters obtained from or- 
thogonal polynomials, as amply considered in this paper, and the Butterworth filter. In the 
analog case the Butterworth filter can be much easier constructed physically. But in the dis- 
crete case the filters obtained from orthogonal polynomials are much easier to handle in the time 
domain than the Butterworth filter. 

For more details about the Butterworth filter for m = see Oppenheim & Schafer |5H 
§5.2.1] (both analog and digital), Johnson [35l §3.2] (analog) and Hamming [271 §12.6] (digital). 
See Herely &: Vetterli [301 §IV-A] and Cohen & Daubechies jl2l §6.2] for Butterworth filters in 
connection with wavelets. 



A Appendix 



Here we derive an explicit expression for the Fourier-Bessel functions (see (4.27)) associated 



with the shifted symmetric Hahn polynomials (see (4.28)). First observe that by the duality 



between Hahn polynomials and dual Hahn polynomials (see the formula after (9.6.16) in [36]) 
the generating function [36, (9.6.12)] for dual Hahn polynomials can be rewritten in terms of 
Hahn polynomials: 



where 



-/3 - iV 



Wx :-- 



N 



x=Q 



a + x\ / f3 + N — X 



X / \ N — X 

i.e., the weight occurring in the orthogonality relation [36t. (9.5.2)] for Hahn polynomials. 



(A.2) 



Next, in (|A.l|) take f3 = a, t := e"*^, replace by 2A^, shift a; to x and apply Pfaff 's 



identity [B (2.3.14)]. Then 



(2a + 2)2N+n ^iNe^^ _ ^-ie^n 



22"(2Ar)!(a + I), 



n-2N,n + a + l 
' 2n + 2a + 2 ' 



N 



x=-N 



(A.3) 



29 



where pn{x) and Wx are given by (4.28) and ( 4.29| ), respectively. 

Now use the quadratic transformation [17, 2.11(30)] and the expression j.181 10.9(20)] of 
Gegenbauer polynomials in terms of hypergeometric functions: 



2F1 



n - 2N, n + a + l 
2n + 2a + 2 



Thus we can rewrite (A. 3) as 
(a + 1) 



N 



-2N) 



ixB 



x=-N 



(A.4) 



The left-hand side of (A.4) gives an expression for the Fourier-Bessel function (4.27) associated 
with the shifted symmetric Hahn polynomials (4.28). 



Now let (j) be defined by (4.23), (4.22) with p„ given by (4.28). Then combination of (4.24), 



(jAJ) and [ISl 10.9(22)] yields that 



-N + n + 1,N + n + a + 2 
2n + a + i 



; sin^(2C<j) 



(A.5) 



where 



C = (-1)"+^ 2^"+^ k2nP2niO) (« + l)2«+i (4n + 2a + 4) 



2N-2n-l 



k2n+lh2n (-2iV)2n+l (2iV - 2n - 1)! 



(-1) 



{N + a + l)n+ii-N)n+i 



(n + a + |)n+in! 



(A.6) 



In the last equality we used [36i (9.5.2), (9.5.4)] for hn and kn and [ISl (2.4)] together with 
(9.5.3)] for getting 

(l)^(N + a + l)n 
{-N + ^)„(a + l)„ 



Formulas (A.5), (A.6) coincide with formulas (4.3), (4.4) in Greville |24j if we replace N,n,a by 
n,k,m, respectively. Integration of (A.6) together with (/)(0) = 1 yields 



0(a;) = 1 + 



N 



{N + a + l)k{-N)k (sm^lio)) 



k 



(A.8) 



Formula (A.8) coincides with formula (4.2) in Greville [21], which Greville (in his earlier form 
(4.1)) ascribes to Sheppard [61]. However, we have only been able to find a match of the special 
case m = of [24', (4.1)] with a formula in Sheppard's paper, namely with [61( (64)]. 
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In the limit for a — )• oo formula (A.8) becomes 

N 



k=n+l 



{k-n-iy. 



k 



(A.9) 



This coincides with the formula after (6.1) in Greville [23], and also with (4.33) combined with 



(4.35) and tfl (2.3.15)]. 



If a € Z>o then the left-hand side of (A.4) can be written as a finite sum, where the number 



of terms is independent of A^. First observe that by [50, (14.13.1), (14.3.21), (5.5.5)] we have 



(A > 0, < 6* < vr). (A.IO) 



For A G Z>o the above series terminates after the term with k = A — 1. Hence, for a G 



(A.4) takes the form 
2A^ + a 



2i 



n+2a 



a 



(2sin(i^))-"-^"-^ ^ ^"'^ - 2a - n - 1). 



fc=0 



A; 



(_2iV-a)fc 



JV 



sin(i(2A^ + 2a + n-2A; + l)6') = ^ w^Pn{x) e'""^ . (A.ll) 



For a = and n = 1 formula (A.ll) specializes to (5.6) with H{uj) given by (5.5). Use that 
Qi(A^ + x;0,0,2A^) = x/N. 
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